Trying to have another look through our results for anything neurodegeneration-related, either at the level of phenotype or disease.

Import results

res <- MultiEWCE::load_example_results()
## Registered S3 method overwritten by 'ggtree':
##   method         from     
##   fortify.igraph ggnetwork
res <- HPOExplorer::add_disease(res)
## Annotating phenos with Disease
## Importing existing file: ... phenotype.hpoa
annot <- HPOExplorer::load_phenotype_to_genes(3)
## Importing existing file: ... phenotype.hpoa

Filter results

First, we do some initial filtering of the results to only get significant associations.

res <- res[q<0.05 & symptoms.pval<0.05,]

Alzheimer’s

AD is not enriched for any phenotype, likely due to the small number of genes included. GWAS results are more comprehensive than this. https://hpo.jax.org/app/browse/term/HP:0002511

ad <- res[grepl("alzheimer",Phenotype,ignore.case = TRUE)]
nrow(ad) 
## [1] 0

Parkinson’s

Phenotypes

“Parkinson’s disease” is not in the HPO, but the branch “Parkinsonism” is. https://hpo.jax.org/app/browse/term/HP:0001300

We actually do have 1 sig phenotype here: “Parkinsonism”

pd_phenos <- HPOExplorer::make_phenos_dataframe(ancestor="Parkinsonism")
## Importing existing file: ... phenotype_to_genes.txt
## Extracting data for 2 descendents.
## Computing gene counts.
## Getting absolute ontology level for 2 HPO IDs.
## Computing ontology level / gene count ratio.
## Adding information_content scores.
## Adding term definitions.
## Adding level-3 ancestor to each HPO ID.
## Making hoverboxes from: 'Phenotype', 'HPO_ID', 'ontLvl', 'ontLvl_geneCount_ratio', 'definition', 'ancestor', 'ancestor_name'
pd1 <- res[HPO_ID %in% pd_phenos$HPO_ID ]

MultiEWCE::create_dt(pd1)
## Loading required namespace: DT

Diseases

We can also search for phenotypes via “Parkinson’s” as a disease.

This gives us lots of phenotypes, but many of them are not specific to PD: e.g. Autosomal recessive inheritance, Open mouth, Macrocephaly, Urinary urgency

d <- annot[grepl("parkinson",DiseaseName, ignore.case = TRUE)]
pd <- res[HPO_ID %in% d$HPO_ID] 
nrow(pd)
## [1] 9928

Let’s try to narrow it down to very PD-specific phenotypes.

These PD-associated phenotypes only appear once in the HPO annotations. Thus, they must be PD-specific.

That leaves us with just 2 phenotypes.

n_pd_terms <- length(unique(pd_phenos$HPO_ID))
pd_pheno_freqs <- sort(table(annot[HPO_ID %in% d$HPO_ID]$HPO_ID))
hist(pd_pheno_freqs, 50)

pd_specific <- pd[HPO_ID %in% names(pd_pheno_freqs)[pd_pheno_freqs<=1+n_pd_terms]]

MultiEWCE::create_dt(pd_specific)

We can lift this filter a bit and look for PD-associated phenotypes that only appear in 3 or less non-PD diseases. We see phenotype that can indeed be associated with PD, but are not necessarily exclusive to it: e.g.
Abnormal aggressive, impulsive or violent behavior, Pill-rolling tremor, “Abnormal CSF protein level

pd_specific2 <- pd[HPO_ID %in% names(pd_pheno_freqs)[pd_pheno_freqs<=3+n_pd_terms] ]

MultiEWCE::create_dt(pd_specific2)

Mental deterioration

Get all the phenos from the Mental deterioration branch. This includes many forms of dementia https://hpo.jax.org/app/browse/term/HP:0001268

As we’ve seen before, only the higher-level term “Mental deterioration” remains significant. Perhaps due to the greater number of genes in this higher level.

phenos <- HPOExplorer::make_phenos_dataframe(ancestor="Mental deterioration")
## Importing existing file: ... phenotype_to_genes.txt
## Extracting data for 14 descendents.
## Computing gene counts.
## Getting absolute ontology level for 14 HPO IDs.
## Computing ontology level / gene count ratio.
## Adding information_content scores.
## Adding term definitions.
## Adding level-3 ancestor to each HPO ID.
## Making hoverboxes from: 'Phenotype', 'HPO_ID', 'ontLvl', 'ontLvl_geneCount_ratio', 'definition', 'ancestor', 'ancestor_name'
md <- res[HPO_ID %in% phenos$HPO_ID]

MultiEWCE::create_dt(md)

Atrophy/Degeneration affecting the central nervous system

HPO has the term: Atrophy/Degeneration affecting the central nervous system https://hpo.jax.org/app/browse/term/HP:0007367

In our results, we found a number of significantly enriched phenotypes that fall under that branch:

phenos <- HPOExplorer::make_phenos_dataframe(ancestor="Atrophy/Degeneration affecting the central nervous system")
## Importing existing file: ... phenotype_to_genes.txt
## Extracting data for 41 descendents.
## Computing gene counts.
## Getting absolute ontology level for 41 HPO IDs.
## Computing ontology level / gene count ratio.
## Adding information_content scores.
## Adding term definitions.
## Adding level-3 ancestor to each HPO ID.
## Making hoverboxes from: 'Phenotype', 'HPO_ID', 'ontLvl', 'ontLvl_geneCount_ratio', 'definition', 'ancestor', 'ancestor_name'
at <- res[HPO_ID %in% phenos$HPO_ID]
MultiEWCE::create_dt(at)

Multi-trait intersection

dementia/neurodegeneration vs. diabetes/obesity

Next, let’s find examples of celltype-specific enrichment results that relate to diseases/phenotypes in the categories of dementia/neurodegeneration vs. diabetes/obesity.

Query results

queries <- paste("alzheimer","parkinson","diabetes","obesity",sep = "|")
phenos <- rbind(
  HPOExplorer::make_phenos_dataframe(ancestor="Atrophy/Degeneration affecting the central nervous system"),
  HPOExplorer::make_phenos_dataframe(ancestor="Mental deterioration"),
  HPOExplorer::make_phenos_dataframe(ancestor="Diabetes mellitus"),
  HPOExplorer::make_phenos_dataframe(ancestor="Obesity")
)
## Importing existing file: ... phenotype_to_genes.txt
## Extracting data for 41 descendents.
## Computing gene counts.
## Getting absolute ontology level for 41 HPO IDs.
## Computing ontology level / gene count ratio.
## Adding information_content scores.
## Adding term definitions.
## Adding level-3 ancestor to each HPO ID.
## Making hoverboxes from: 'Phenotype', 'HPO_ID', 'ontLvl', 'ontLvl_geneCount_ratio', 'definition', 'ancestor', 'ancestor_name'
## Importing existing file: ... phenotype_to_genes.txt
## Extracting data for 14 descendents.
## Computing gene counts.
## Getting absolute ontology level for 14 HPO IDs.
## Computing ontology level / gene count ratio.
## Adding information_content scores.
## Adding term definitions.
## Adding level-3 ancestor to each HPO ID.
## Making hoverboxes from: 'Phenotype', 'HPO_ID', 'ontLvl', 'ontLvl_geneCount_ratio', 'definition', 'ancestor', 'ancestor_name'
## Importing existing file: ... phenotype_to_genes.txt
## Extracting data for 11 descendents.
## Computing gene counts.
## Getting absolute ontology level for 11 HPO IDs.
## Computing ontology level / gene count ratio.
## Adding information_content scores.
## Adding term definitions.
## Adding level-3 ancestor to each HPO ID.
## Making hoverboxes from: 'Phenotype', 'HPO_ID', 'ontLvl', 'ontLvl_geneCount_ratio', 'definition', 'ancestor', 'ancestor_name'
## Importing existing file: ... phenotype_to_genes.txt
## Extracting data for 4 descendents.
## Computing gene counts.
## Getting absolute ontology level for 4 HPO IDs.
## Computing ontology level / gene count ratio.
## Adding information_content scores.
## Adding term definitions.
## Adding level-3 ancestor to each HPO ID.
## Making hoverboxes from: 'Phenotype', 'HPO_ID', 'ontLvl', 'ontLvl_geneCount_ratio', 'definition', 'ancestor', 'ancestor_name'
res_inter <- res[grepl(queries,Phenotype,ignore.case = TRUE) | 
                 grepl(queries,DiseaseName,ignore.case = TRUE) |
                   HPO_ID %in% phenos$HPO_ID] 

message(formatC(nrow(res_inter),big.mark = ",")," results remain.")
## 511 results remain.

Filter results

We still have a lot of results left (500+ rows), which is a bit much to visualize. So let’s do some additional filtering.

res_filt <- MultiEWCE::prioritise_targets(results = res_inter, 
                                          keep_tiers = NULL,
                                          severity_threshold = NULL,
                                          keep_onsets = NULL, 
                                          keep_deaths = NULL,
                                          pheno_ndiseases_threshold = NULL,
                                          symptom_p_threshold = NULL, 
                                          symptom_intersection_size_threshold = 2)
## Prioritising gene targets.
## Adding HPO IDs.
## Importing existing file: ... phenotype_to_genes.txt
## Translating all phenotypes to HPO IDs.
## + Returning a vector of phenotypes (same order as input).
## Adding term definitions.
## Prioritised targets: step='start' 
##  - Rows: 511 
##  - Phenotypes: 96 
##  - Diseases: 71 
##  - Cell types: 32
## Filtering @ q-value <= 0.05
## Prioritised targets: step='q_threshold' 
##  - Rows: 511 
##  - Phenotypes: 96 
##  - Diseases: 71 
##  - Cell types: 32
## Filtering @ fold-change >= 1
## Prioritised targets: step='fold_threshold' 
##  - Rows: 511 
##  - Phenotypes: 96 
##  - Diseases: 71 
##  - Cell types: 32
## Adding level-3 ancestor to each HPO ID.
## Removing remove descendants of: 'Clinical course'
## Translating all phenotypes to HPO IDs.
## + Returning a dictionary of phenotypes (different order as input).
## Prioritised targets: step='remove_descendants' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Getting absolute ontology level for 95 HPO IDs.
## Prioritised targets: step='keep_ont_levels' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Annotating phenos with Onset.
## Prioritised targets: step='keep_onsets' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Annotating phenos with AgeOfDeath
## Prioritised targets: step='keep_deaths' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Annotating phenos with Tiers.
## Prioritised targets: step='keep_tiers' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Annotating phenos with Modifiers
## Prioritised targets: step='severity_threshold' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Annotating phenos with n_diseases
## Importing existing file: ... phenotype_to_genes.txt
## Importing existing file: ... genes_to_phenotype.txt
## Importing existing file: ... phenotype.hpoa
## Prioritised targets: step='pheno_ndiseases_threshold' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Annotating phenotype frequencies.
## Prioritised targets: step='pheno_frequency_threshold' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Prioritised targets: step='symptom_p_threshold' 
##  - Rows: 510 
##  - Phenotypes: 95 
##  - Diseases: 71 
##  - Cell types: 32
## Prioritised targets: step='symptom_intersection_size_threshold' 
##  - Rows: 501 
##  - Phenotypes: 87 
##  - Diseases: 65 
##  - Cell types: 30
## 21 / 30 of cell types kept.
## Prioritised targets: step='keep_celltypes' 
##  - Rows: 372 
##  - Phenotypes: 81 
##  - Diseases: 56 
##  - Cell types: 21
## Filtering by gene size.
## Converting phenos to GRanges.
## Translating all phenotypes to HPO IDs.
## + Returning a vector of phenotypes (same order as input).
## Loading required namespace: ensembldb
## Gathering gene metadata
## Loading required namespace: EnsDb.Hsapiens.v75
## Prioritised targets: step='keep_seqnames' 
##  - Rows: 41,687 
##  - Phenotypes: 81 
##  - Genes: 3,739
## 246 / 3,739 genes kept.
## Prioritised targets: step='gene_size' 
##  - Rows: 2,315 
##  - Phenotypes: 79 
##  - Genes: 246
## Prioritised targets: step='keep_biotypes' 
##  - Rows: 2,315 
##  - Phenotypes: 79 
##  - Genes: 246
## Filtering by mean_exp_quantile.
## Annotating gene frequencies.
## Importing existing file: ... genes_to_phenotype.txt
## Prioritised targets: step='gene_frequency_threshold' 
##  - Rows: 11,393 
##  - Phenotypes: 79 
##  - Diseases: 56 
##  - Cell types: 21 
##  - Genes: 246
## Prioritised targets: step='keep_specificity_quantiles' 
##  - Rows: 11,393 
##  - Phenotypes: 79 
##  - Diseases: 56 
##  - Cell types: 21 
##  - Genes: 246
## Prioritised targets: step='keep_mean_exp_quantiles' 
##  - Rows: 8,851 
##  - Phenotypes: 79 
##  - Diseases: 56 
##  - Cell types: 21 
##  - Genes: 200
## Prioritised targets: step='symptom_gene_overlap' 
##  - Rows: 148 
##  - Phenotypes: 34 
##  - Diseases: 16 
##  - Cell types: 7 
##  - Genes: 13
## Sorting rows.
## Prioritised targets: step='end' 
##  - Rows: 148 
##  - Phenotypes: 34 
##  - Diseases: 16 
##  - Cell types: 7 
##  - Genes: 13

Plot network

vn <- MultiEWCE::prioritise_targets_network(
  top_targets = res_filt$top_targets, 
  submain = "dementia/neurodegeneration & diabetes/obesity", 
  save_path = here::here("networks/dementia_diabetes_network.html"), 
  show_plot = FALSE)
## Loading required namespace: pals
## Loading required namespace: igraph
## Loading required namespace: tidygraph
## Creating network.
## Loading required namespace: visNetwork
## Creating plot.
## Saving plot ==> /Users/schilder/Desktop/ewce/RareDiseasePrioritisation/networks/dementia_diabetes_network.html
vn$plot

Session info

sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##   [1] utf8_1.2.3                    R.utils_2.12.2               
##   [3] tidyselect_1.2.0              RSQLite_2.3.0                
##   [5] AnnotationDbi_1.60.2          htmlwidgets_1.6.1            
##   [7] grid_4.2.1                    BiocParallel_1.32.5          
##   [9] munsell_0.5.0                 codetools_0.2-19             
##  [11] DT_0.27                       colorspace_2.1-0             
##  [13] Biobase_2.58.0                filelock_1.0.2               
##  [15] highr_0.10                    knitr_1.42                   
##  [17] rstudioapi_0.14               orthogene_1.4.1              
##  [19] stats4_4.2.1                  SingleCellExperiment_1.20.0  
##  [21] ggsignif_0.6.4                MatrixGenerics_1.10.0        
##  [23] GenomeInfoDbData_1.2.9        bit64_4.0.5                  
##  [25] rprojroot_2.0.3               coda_0.19-4                  
##  [27] vctrs_0.6.0                   treeio_1.23.1                
##  [29] generics_0.1.3                xfun_0.37                    
##  [31] BiocFileCache_2.6.1           R6_2.5.1                     
##  [33] GenomeInfoDb_1.34.9           pals_1.7                     
##  [35] AnnotationFilter_1.22.0       bitops_1.0-7                 
##  [37] cachem_1.0.7                  gridGraphics_0.5-1           
##  [39] DelayedArray_0.24.0           promises_1.2.0.1             
##  [41] BiocIO_1.8.0                  scales_1.2.1                 
##  [43] gtable_0.3.1                  tidygraph_1.2.3              
##  [45] ontologyPlot_1.6              ensembldb_2.22.0             
##  [47] rlang_1.1.0                   rtracklayer_1.58.0           
##  [49] rstatix_0.7.2                 lazyeval_0.2.2               
##  [51] dichromat_2.0-0.1             broom_1.0.4                  
##  [53] BiocManager_1.30.20           yaml_2.3.7                   
##  [55] reshape2_1.4.4                HPOExplorer_0.99.7           
##  [57] abind_1.4-5                   GenomicFeatures_1.50.4       
##  [59] ggnetwork_0.5.12              crosstalk_1.2.0              
##  [61] backports_1.4.1               httpuv_1.6.9                 
##  [63] tools_4.2.1                   ggplotify_0.1.0              
##  [65] statnet.common_4.8.0          ggplot2_3.4.1                
##  [67] ellipsis_0.3.2                gplots_3.1.3                 
##  [69] jquerylib_0.1.4               paintmap_1.0                 
##  [71] RColorBrewer_1.1-3            BiocGenerics_0.44.0          
##  [73] Rcpp_1.0.10                   plyr_1.8.8                   
##  [75] visNetwork_2.1.2              progress_1.2.2               
##  [77] zlibbioc_1.44.0               purrr_1.0.1                  
##  [79] RCurl_1.98-1.10               prettyunits_1.1.1            
##  [81] ggpubr_0.6.0                  S4Vectors_0.36.2             
##  [83] SummarizedExperiment_1.28.0   grr_0.9.5                    
##  [85] here_1.0.1                    magrittr_2.0.3               
##  [87] data.table_1.14.8             ProtGenerics_1.30.0          
##  [89] matrixStats_0.63.0            hms_1.1.2                    
##  [91] patchwork_1.1.2               mime_0.12                    
##  [93] evaluate_0.20                 xtable_1.8-4                 
##  [95] XML_3.99-0.13                 EWCE_1.7.2                   
##  [97] IRanges_2.32.0                MultiEWCE_0.1.4              
##  [99] compiler_4.2.1                biomaRt_2.54.0               
## [101] maps_3.4.1                    tibble_3.2.0                 
## [103] KernSmooth_2.23-20            crayon_1.5.2                 
## [105] R.oo_1.25.0                   htmltools_0.5.4              
## [107] ggfun_0.0.9                   later_1.3.0                  
## [109] tidyr_1.3.0                   aplot_0.1.10                 
## [111] DBI_1.1.3                     ExperimentHub_2.6.0          
## [113] gprofiler2_0.2.1              dbplyr_2.3.1                 
## [115] rappdirs_0.3.3                babelgene_22.9               
## [117] EnsDb.Hsapiens.v75_2.99.0     Matrix_1.5-3                 
## [119] car_3.1-1                     piggyback_0.1.4              
## [121] cli_3.6.0                     R.methodsS3_1.8.2            
## [123] igraph_1.4.1                  parallel_4.2.1               
## [125] GenomicRanges_1.50.2          pkgconfig_2.0.3              
## [127] GenomicAlignments_1.34.1      plotly_4.10.1                
## [129] xml2_1.3.3                    ggtree_3.6.2                 
## [131] bslib_0.4.2                   XVector_0.38.0               
## [133] GeneOverlap_1.34.0            yulab.utils_0.0.6            
## [135] stringr_1.5.0                 digest_0.6.31                
## [137] graph_1.76.0                  Biostrings_2.66.0            
## [139] rmarkdown_2.20.1              HGNChelper_0.8.1             
## [141] tidytree_0.4.2                restfulr_0.0.15              
## [143] curl_5.0.0                    shiny_1.7.4                  
## [145] Rsamtools_2.14.0              gtools_3.9.4                 
## [147] rjson_0.2.21                  lifecycle_1.0.3              
## [149] nlme_3.1-162                  jsonlite_1.8.4               
## [151] carData_3.0-5                 network_1.18.1               
## [153] mapproj_1.2.11                viridisLite_0.4.1            
## [155] limma_3.54.2                  fansi_1.0.4                  
## [157] pillar_1.8.1                  ontologyIndex_2.10           
## [159] lattice_0.20-45               homologene_1.4.68.19.3.27    
## [161] KEGGREST_1.38.0               fastmap_1.1.1                
## [163] httr_1.4.5                    interactiveDisplayBase_1.36.0
## [165] glue_1.6.2                    RNOmni_1.0.1                 
## [167] png_0.1-8                     ewceData_1.7.1               
## [169] BiocVersion_3.16.0            bit_4.0.5                    
## [171] Rgraphviz_2.42.0              stringi_1.7.12               
## [173] sass_0.4.5                    blob_1.2.3                   
## [175] AnnotationHub_3.6.0           caTools_1.18.2               
## [177] memoise_2.0.1                 dplyr_1.1.0                  
## [179] ape_5.7-1